Vol. 1; Iss. 1; Year 2013



Intl. Jrnl. on Computer Science and Technologies 



Feature Extraction and Analysis of Breast Cancer Specimen

Debnath Bhattacharyya 1 , Tai-hoon Kim 2 , Samir Kumar Bandyopadhyay 3 



1 Computer Science and Engineering Department,
Heritage Institute of Technology, Kolkata-700107, India

2 Hannam University, Daejeon - 306791, Korea 

3 Department of Computer Science and Engineering, 
University of Calcutta, Kolkata-700009, India 



Abstract- In this paper, we propose a method to identify abnormal growth of breast tissue and
suggest a further pathological test, if necessary. We compare normal breast tissue with malignant invasive
breast tissue by a series of image processing steps. Normal ductal epithelial cells and ductal / lobular
invasive carcinogenic cells are also considered for comparison in this paper. In fact, features of cancerous
breast tissue (invasive) are extracted and analysed against normal breast tissue. We also suggest a breast
cancer recognition technique through image processing, and prevention by controlling P53 gene mutation
to some greater extent.

1. Introduction 

that cannot be felt but can be seen on a conventional mammogram or with ultrasound. One type of needle
biopsy, the stereotactic-guided biopsy, involves the precise location of the abnormal area in three
dimensions using conventional mammography. Stereotactic refers to the use of a computer and scanning
devices to create three-dimensional images. A needle is then inserted into the breast and a tissue sample is
obtained. Additional samples can be obtained by moving the needle within the abnormal area [2].

Another type of needle biopsy uses a different system, known as the Mammotome breast biopsy system.
The FDA (Food and Drug Administration) approved the Mammotome in 1996; the hand-held version of the
Mammotome received FDA clearance in September 1999. A large needle is inserted into the suspicious area
using ultrasound or stereotactic guidance. The Mammotome is then used to gently vacuum tissue from the
suspicious area. Additional tissue samples can be obtained by rotating the needle. This procedure can be
performed with the patient lying on her stomach on a table. If the hand-held device is used, the patient may
lie on her back or in a seated position.





There have been no reports of serious complications resulting from the Mammotome breast biopsy system.
Women interested in this procedure should talk with their doctor.

Digital mammography is a technique for recording x-ray images in computer code instead of on x-ray film,
as with conventional mammography. The images are displayed on a computer monitor and can be enhanced
(lightened or darkened) before they are printed on film. Images can also be manipulated; the
radiologist can magnify or zoom in on an area. From the patient's perspective, the procedure for a 
mammogram with a digital system is the same as for conventional mammography [2]. 

Digital mammography may have some advantages over conventional mammography. The images can be 
stored and retrieved electronically, which makes long-distance consultations with other mammography 
specialists easier. Because the images can be adjusted by the radiologist, subtle differences between tissues 
may be noted. The improved accuracy of digital mammography may reduce the number of follow-up
procedures. Despite these benefits, studies have not yet shown that digital mammography is more effective
in finding cancer than conventional mammography. 

The first digital mammography [1] system received U.S. Food and Drug Administration (FDA) approval in
2000. An example of a digital mammography system is the Senographe 2000D. Women considering digital
mammography should talk with their doctor or contact a local FDA-certified mammography center to find
out if this technique is available at that location. Only facilities that have been certified to practice
conventional mammography and have FDA approval for digital mammography may offer the digital system.

Many more techniques are available other than the cytogenetic processes; however, these are ir[...]
technologies to detect, diagnose, and characterize breast cancer.



2. Previous works 

Numerous promising approaches are coming up; only a few of them are stated here out of many, and these
are very recent.

V. Mallapragada, et al, October, 2007, presented [3, 7] a new concept for real-time manipulation of a tumor
using a robotic force controller that monitored the image of the tumor to generate appropriate force to
position the tumor at a desired location. The idea was to demonstrate that it is possible to manipulate a
tumor in real-time by applying controlled external force in an automated way such that the tumor did not
deviate from the path of the needle. The success of this approach has the potential to reduce the number of
attempts a surgeon makes to capture the desired tissue specimen, minimize tissue damage, improve the speed
of biopsy, and reduce patient discomfort.

Cigdem Gunduz, et al, 2004, reported a computational method that modeled a type of brain cancer using
topological properties of cells in the tissue image. They constructed the graphs based on the locations of
cells within the image. They used the Waxman model in their experiment [4].
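As an illustration of the cell-graph idea described above, the following is a minimal sketch of the standard Waxman random-graph model, in which two nodes at distance d are linked with probability alpha * exp(-d / (beta * L)), where L is the largest pairwise distance. The function names, the parameter values alpha = 0.4 and beta = 0.2, and the use of plain (x, y) tuples for cell centres are assumptions made for this example; they are not taken from [4].

```python
import math
import random


def waxman_probability(distance, max_distance, alpha=0.4, beta=0.2):
    """Waxman link probability: alpha * exp(-d / (beta * L)).

    alpha and beta are illustrative values, not parameters from the cited work.
    """
    return alpha * math.exp(-distance / (beta * max_distance))


def build_cell_graph(cells, alpha=0.4, beta=0.2, seed=0):
    """Return a list of edges (i, j), i < j, between cell centres.

    `cells` is a list of (x, y) coordinates (hypothetical input format).
    Each pair is linked at random with the Waxman probability.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    pairs = [(i, j) for i in range(len(cells)) for j in range(i + 1, len(cells))]
    dist = {(i, j): math.dist(cells[i], cells[j]) for i, j in pairs}
    max_distance = max(dist.values(), default=0.0)
    if max_distance == 0.0:
        return []  # fewer than two distinct cell locations
    return [
        pair
        for pair in pairs
        if rng.random() < waxman_probability(dist[pair], max_distance, alpha, beta)
    ]


cells = [(0.0, 0.0), (1.0, 0.0), (50.0, 50.0)]
edges = build_cell_graph(cells)
```

Because the probability decays exponentially with distance, nearby cells are far more likely to be joined than distant ones, which is what makes the resulting graph sensitive to local cell density.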



C. Cagatay Bilgin, et al, 2007, classified [5] the cancer tissues using graph theory. An image segmentation
approach was used and Euclidean distances were calculated between vertices [5]. Cell graphs were
generated by considering the cell locations. The approach was, to a great extent, the same as the work of
Cigdem Gunduz, et al, 2004.





These approaches toward automatic detection of cancer actually failed because more and more types of
cancer were being identified.

A.M. Tang, et al, 2008, reported that simultaneous capturing of ultrasound (US) and magnetic resonance (MR)
images allowed fusion of information obtained from both modalities. They described an MR-compatible US
system in which MR images were acquired in a known orientation with respect to the US imaging plane and
concurrent real-time imaging could be achieved. Compatibility of the two imaging devices was a major issue
in the physical setup. Tests were performed to quantify the radio frequency (RF) noise introduced in MR and
US images, with the US system used in conjunction with MRI scanners of different field strengths (0.5 T and
3 T). Furthermore, simultaneous imaging was performed on a dual modality breast phantom in the 0.5 T open
bore and 3 T close bore MRI systems to aid needle-guided breast biopsy. Fiducial based passive tracking and
electromagnetic based active tracking were used in 3 T and 0.5 T, respectively, to establish the location and
orientation of the US probe inside the magnet bore. Their results indicated that simultaneous US and MR
imaging were feasible with properly-designed shielding, resulting in negligible broadband noise and
minimal periodic RF noise in both modalities. US could be used for real time display of the needle
trajectory, while MRI could be used to confirm needle placement [6].

C. Zhu, et al, 2009, explored [8] the use of a fiber-optic probe for in vivo fluorescence spectroscopy of
breast tissues during percutaneous image-guided breast biopsy. A total of 121 biopsy samples with
accompanying histological diagnosis were obtained clinically and investigated in their study. The tissue
spectra were analyzed using partial least-squares analysis and represented using a set of principal
components (PCs) with dramatically reduced data dimension. For nonmalignant tissue samples, a set of
PCs that account for the largest amount of variance in the spectra displayed correlation with the percent
tissue composition. For all tissue samples, a set of PCs was identified using a Wilcoxon rank-sum test as
showing statistically significant differences between: 1) malignant and fibrous/benign; 2) malignant and
adipose; and 3) malignant and nonmalignant breast samples. These PCs were used to distinguish malignant
from other nonmalignant tissue types using a binary classification scheme based on both linear and
nonlinear support vector machine (SVM) and logistic regression (LR). For the sample set investigated in
this study, the SVM classifier provided a cross-validated sensitivity and specificity of up to 81% and 81%,
respectively, for discrimination between malignant and fibrous/benign samples, and up to 81% and [...],
respectively, for discriminating between malignant and adipose samples. Classification based on LR was
used to generate receiver operator curves with an area under the curve (AUC) of 0.87 for discriminating
malignant versus fibrous/benign tissues, and an AUC of 0.84 for discriminating malignant from adipose
tissue samples. This study demonstrated the feasibility of performing fluorescence spectroscopy during
clinical core needle breast biopsy, and the potential of that technique for identifying breast malignancy in
vivo.
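For readers unfamiliar with the two figures of merit quoted above, the following small sketch shows how sensitivity and specificity are computed from binary labels. The function name and the toy label lists are assumptions for illustration only; they are not data from [8].

```python
def sensitivity_specificity(true_labels, predicted_labels, positive=1):
    """Return (sensitivity, specificity) for a binary classification.

    sensitivity = TP / (TP + FN): fraction of malignant samples that are caught.
    specificity = TN / (TN + FP): fraction of nonmalignant samples that are cleared.
    """
    tp = fn = tn = fp = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == positive:
            if guess == positive:
                tp += 1
            else:
                fn += 1
        else:
            if guess == positive:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy labels (1 = malignant, 0 = nonmalignant); purely illustrative.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
guess = [1, 1, 1, 0, 0, 0, 0, 0]
sens, spec = sensitivity_specificity(truth, guess)
```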



Lin Yang, et al, 2007, introduced a Grid-enabled CAD to perform automatic analysis of imaged
histopathology breast tissue specimens [10]. More than 100,000 digitized samples (1200 x 1200 pixels) were
processed on the Grid. They analyzed results for 3744 breast tissue samples, which originated from
four different institutions, using diaminobenzidine (DAB) and hematoxylin staining. Both linear and
nonlinear dimension reduction techniques were compared, and the best one was applied to reduce the
dimensionality of the features. The results showed that Gentle Boosting using an eight node CART
decision tree as the weak learner provided the best result for classification. The algorithm has an accuracy
of 86.02% using only 20% of the specimens as the training set.





We used free Tissue Blocks downloaded from OriGene Technologies, Inc, 2009 [9]. In our experiment,
18 invasive breast cancer tissues from different patients and 8 non-cancerous, falsely detected breast
tissues from 8 different normal females are considered. Each 24-bit BMP image is 640 x 480 pixels in size.



3.1. 24-bit Color Image to 256-color Gray Image



1. Take the 24-bit BMP file as the input file and open it in binary mode (size MxM).

2. Copy the image information (first 54 bytes) of the header from the input 24-bit BMP file to a newly
created BMP file, and edit this header by changing the file size, bit depth, and colors to conform to an
8-bit BMP.

3. Copy the color table from a sample gray-scale image to this newly created BMP from the 54th byte
onwards.

4. Convert each RGB value to a gray value using the following formula:

a. grayValue = (0.299*redValue + 0.587*greenValue + 0.114*blueValue);
b. blueValue = greenValue = redValue = grayValue;

5. Write to the new BMP file.



The algorithm takes a 24-bit BMP color image as input and produces a 256-color gray-scale image as
output. It first reads the red, green, and blue values of each pixel and then combines these three values
into a single gray value using the weighted sum stated in Step 4.
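The per-pixel conversion of Step 4 can be sketched as follows. This is a minimal illustration that works on an in-memory list of (red, green, blue) tuples; the function names are assumptions, and the BMP header and color-table handling of Steps 1 to 3 is deliberately left out.

```python
def rgb_to_gray(red, green, blue):
    """Return the 8-bit gray level of one pixel using the Step 4 weights."""
    gray = 0.299 * red + 0.587 * green + 0.114 * blue
    # Round to the nearest integer and guard against floating-point overshoot.
    return min(255, int(round(gray)))


def image_to_gray(pixels):
    """Convert a 2D list of (red, green, blue) tuples to a 2D list of gray levels."""
    return [[rgb_to_gray(r, g, b) for (r, g, b) in row] for row in pixels]


# Tiny 2 x 2 test image: white, black, pure red, pure blue.
sample = [
    [(255, 255, 255), (0, 0, 0)],
    [(255, 0, 0), (0, 0, 255)],
]
gray_image = image_to_gray(sample)
```

Computing the weighted sum once and then assigning it to all three channels avoids reusing an already overwritten channel value in a later multiplication.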

3.2. 256-color Gray Image to Bi-color (using Pixel Clustering on Threshold Value, T)

1. Open the 256-color image (size MxM)

2. Read a pixel value

3. If the pixel intensity value is less than or equal to T (128), then make it 0; else make it 255. Write
the result into the same pixel location

4. Go to Step 2 until end of file

5. Close file

This algorithm is used here to convert the gray image to a bi-color (monochrome) image. In some
cases we can say this is an edge detection algorithm set on a threshold value.
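The thresholding steps above can be sketched as below, assuming the gray image is already held in memory as a 2D list of integers in the range 0 to 255 (the file handling of Steps 1, 4, and 5 is omitted, and the function name is hypothetical).

```python
THRESHOLD = 128  # T, as given in Step 3


def to_bicolor(gray_image, threshold=THRESHOLD):
    """Map every pixel to 0 if it is <= threshold, otherwise to 255."""
    return [
        [0 if value <= threshold else 255 for value in row]
        for row in gray_image
    ]


bicolor = to_bicolor([[0, 128, 129], [200, 127, 255]])
```

Note that a pixel exactly equal to T is mapped to 0, following the "less than or equal to" wording of Step 3.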

3.3. Cell Representation Algorithm on Spatial Domain

1. Open the bi-color image (size MxM)
Set a 2D integer array (equivalent to the size of the bi-color image, MxM)

2. Read a pixel value

3. Store it in the corresponding location of the 2D array (if the pixel value is 255, make it 1 in our case)

4. Go to Step 2 until end of file

5. Close file

6. Draw the graph on 2D space using the generated binary matrix

7. End



The generated binary matrix can be used for future statistical analysis to make the system automatic,
together with other biological characteristics of breast cancer cells. In this work we compare
those graphs and suggest either a further pathological test or no need of a test.
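A minimal sketch of building the binary matrix from a bi-color image is given below. It assumes the image is an in-memory 2D list containing only 0 and 255, follows the "255 becomes 1" convention of the steps above, and uses a plain-text rendering as a stand-in for drawing the graph; the function names and the rendering characters are assumptions.

```python
def to_binary_matrix(bicolor_image):
    """Return a 2D list of 0/1 values: 1 where the pixel is 255, else 0."""
    return [[1 if value == 255 else 0 for value in row] for row in bicolor_image]


def render(matrix, on="#", off="."):
    """Draw the binary matrix as text, one character per entry."""
    return "\n".join("".join(on if v else off for v in row) for row in matrix)


matrix = to_binary_matrix([[0, 255], [255, 0]])
picture = render(matrix)
```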



4. Analysis and Result

Here the challenge is mammogram and digital biopsy. Problems with the mammogram may arise with biopsy also.
We first consider some aspects of mammogram analysis; we have noticed the same problems with biopsy.
In most individuals the bulk of the breast extends from the second to the seventh rib. Since breast tissues
often curve around the lateral margin of the pectoralis major muscle (Figure 1), the orientation of the
muscle is important for optimal mammographic positioning. The pectoralis major muscle spreads like a fan
across the chest wall. Portions of the pectoralis major muscle attach to the clavicle, the lateral margin of the
sternum, the costal cartilages, and the aponeurosis of the external oblique muscles of the abdomen. All these
fibers converge on and attach to the greater tubercle of the humerus. The free fibers predominantly run
obliquely over the chest from the medial portion of the thorax toward the humerus. The relationship of the
breast to the pectoralis major muscle influences two-dimensional projectional imaging, such as
mammography. Since the breast tissue is closely applied to the muscle, some of the lateral tissues can only
be imaged through the muscle. As with any soft-tissue structure overlying muscle, it is easier to project the
breast into the field of view by pulling it away from the chest wall and compressing it with the plane of
compression along the obliquely oriented muscle fibers of the pectoralis major muscle. In order to
maximize the tissue imaged, the free portion of the muscle should be included in the field of view.




In view of the enormous amount of work that has been done in an effort to understand the breast and the 
development of breast cancer, it is surprising that the normal breast has never been clearly defined. This is 
likely due to the fact that since breast cancer is really the only significant abnormality that occurs in the 
breast, it is really only the changes that appear to predispose to breast cancer that are considered 
significant. There is a large range of histologic findings that occur in women who never develop breast 
cancer, but where normal ends and abnormal begins is not obvious, and past classifications have been
found to be inaccurate. 

The ability to detect breast cancers earlier requires high-quality imaging, proper film processing, systematic 
review of the images, reasoned interpretation, the ability to solve problems raised by the imaging, and the 
ability to guide the diagnostic removal of cells or tissue for diagnosis. The interpreter should participate in 
all aspects of this process. It is very important that quality control be supervised by the interpreter(s) of the 
images so that any image degradation can be detected and corrected as quickly as possible. 

Errors can be reduced by following a carefully structured approach to the process. The detection and
diagnosis of breast cancer can be divided into five very specific tasks: Detection: Find it. Verification: Is it
real? Triangulation: Where is it? Identification: What is it? Management: What should be done about
it?



Figure 1. Computed breast tomography with the breast in the pendent position shows breast tissue on the
left adjacent to the pectoralis major muscle extending up toward the axilla.



Ductal cancer can spread up and down the duct network and remain in situ, whereas invasive cancer can
be found associated with a part of the [...]. This finding would support the continuum theory. Their
data suggest that one of the already genetically unstable cells in the duct developed an invasive clone and
that this clone proliferated while the remaining in situ cells, unable to invade, continued to proliferate and
spread up and down the duct. This observation explains how invasive breast cancer can be found in the same
lesion (Figure 2). In Figure 2, outside the ducts and lobules a huge amount of breast muscle and tissue is
present and here is the [...]

An understanding of tissue patterns as they apply to the sensitivity of mammographic detection of
breast malignancy is important. The greater the amount of fat within the breast, the easier it is to recognize
a water-density tumor (Figure 3). As in any other x-ray study, the margins of a water-density cancer will be
obscured or invisible when they are contiguous with normal tissue of equivalent x-ray attenuation. In
breasts in which the parenchyma is nonuniform, the x-ray attenuation will vary in a nonuniform way,
making it difficult to detect a small cancer whose margins are similarly nonuniform. In the breast that is
heterogeneously dense or extremely dense, the sensitivity of mammography, not only for the early
detection of malignancy but also for large cancers, is somewhat diminished because of the difficulty of
detecting ill-defined cancers within the inhomogeneous background.

The fact that mammography can detect very small cancers but can also miss some very large cancers is
confusing to clinicians and the public. Figures 4-11 are useful for explaining how mammography can detect
many very small cancers, while some large palpable cancers can still be difficult to image.



The dense breast is not the only reason for overlooking cancers. It is of some interest that among cancers
overlooked in the screening study many cancers were overlooked in women with predominantly fatty
breast tissue. Detecting small cancers in the dense breast is more difficult, but early-stage breast cancer can
be detected by mammography among these women. In a review of 118 women with breast cancer detected
by mammography alone [32], among women under the age of 50 years, we found that 70% were detected in
women with radiographically dense breast tissue and these were at a smaller size and earlier stage than
among women with palpable cancers. Even though a higher proportion of younger women have dense
tissues, recent data from modern mammography screening programs show that mammography can detect
early cancers among women aged 40 to 49 years at the same proportion as for women aged 50 to 59 years
[33, 34]. The dense breast does reduce the sensitivity of mammography somewhat, but should not deter
screening among these women and is not the sole cause for overlooking breast cancers.







Figure 2. Cells that are proliferating out of control but lack the ability to invade may continue to grow
within the duct while a clone that has developed invasive capability can be growing simultaneously in the
same lesion.




Figure 3. Invasive ductal carcinoma is easily visible because it is surrounded by fat tissue.





Figure 4. 



Figure 5.

Figure 11.




The projection of a potted plant onto the wall using a spotlight (Figure 4) is a good analogy to the breast
and cancer detection by clinical breast examination and mammography. Assume that a chestnut, with its
shell (Figure 5), is placed in among the branches and leaves of the plant (Figure 6). If the leaves are
tightly packed, the nut, even a very large one, may not be visible (Figure 7), yet fingers pressed against it can
easily feel it (Figure 8). If the plant has fewer leaves, analogous to the breast with less fibrous tissue, then
the nut becomes more visible (Figure 9). If there are few leaves, then even a very small nut is visible (Figure
10), and if an extremely small nut is nestled between the rigid stems of the plant, the nut may be easily
visible, but not palpable because it is protected by the stems (Figure 11).



Our algorithms, especially the first and second, are used to remove the huge amount of tissue and fat
surrounding the cancerous cells within the biopsy samples, which we call tissue blocks here, shown in Figure 12.
Our target is to get an image something like that in Figure 11. Outputs of those algorithms are shown in Figure
12a-c, for normal breast tissue. Figure 12c shows the cells as black spots on the space.




Figure 12a. 24-bit Color Image Figure 12b. 256-Color Gray Image Figure 12c. Bi-color Color Image





Figure 13a. 24-bit Color Image Figure 13b. 256-Color Gray Image Figure 13c. Bi-color Color Image

Figure 13a-c shows the cancerous cells; cells of abnormal size and number are marked. These
outputs also come from the same set of algorithms. We have conducted the observations using tissue blocks
from 18 different patients, all of which are invasive breast cancer, and 8 normal breast tissue blocks.



Graphical observations are also documented as shown in Figures 14 and 15. In case of normal tissue, disconnected
cell graphs have been identified in few numbers. On the other hand, in case of invasive breast cancer
tissue, uncounted connected cell graphs are observed.
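One possible way to put a number on the "few disconnected" versus "uncounted connected" observation is to count connected groups of set entries in the binary matrix of Section 3.3. The sketch below does this with a breadth-first search over 8-connected neighbours. It is an assumption made for illustration: the paper does not state how its cell graphs are constructed or counted, and the function name, the choice of 8-connectivity, and the demo matrix are all hypothetical.

```python
from collections import deque


def count_components(matrix, target=1):
    """Count 8-connected groups of entries equal to `target` in a 2D list."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    components = 0
    for r in range(rows):
        for c in range(cols):
            if matrix[r][c] != target or seen[r][c]:
                continue
            # A new, unvisited group starts here; flood it with BFS.
            components += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (
                            0 <= ny < rows
                            and 0 <= nx < cols
                            and not seen[ny][nx]
                            and matrix[ny][nx] == target
                        ):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return components


demo = [
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
]
groups = count_components(demo)
```

Since Figure 12c describes the cells as black spots, calling the function with `target=0` would count the dark regions instead, depending on which value represents cells in a given matrix.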

5. Conclusion 



Till date, it is observed that genetic mutation of certain oncogenes is responsible for any type of cancer.
Modern techniques are being used for treatment, and chemotherapy is an established way of controlling cancers
nowadays. But the question is: "Why are the oncogenes suddenly changing their behavior or
becoming inactivated?" Next, we will put more effort into the genetic behavior of cancer genes and how these
can be tuned, which leads to more biometrics.

Figure 14. Invasive breast cancer tissue with cells in graphical problem space with dotted signs.